ABSTRACT: The novel coronavirus SARS-CoV-2, the cause of the COVID-19 pandemic, has inspired one of the most efficient vaccine development campaigns in human history. A key aspect of COVID-19 mRNA vaccines is the use of the modified nucleobase N1-methylpseudouridine (m1Ψ) to increase their effectiveness. In this Outlook, we summarize the development and function of m1Ψ in synthetic mRNAs. By demystifying how a novel element within these medicines works, we aim to foster understanding and highlight future opportunities for chemical innovation.
Received: February 10, 2021
Published: April 6, 2021

■ INTRODUCTION

On December 11, 2020, the U.S. Food and Drug Administration (FDA) issued the first emergency use authorization (EUA) for a vaccine to prevent COVID-19, a disease caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Approval of a second COVID-19 vaccine followed 1 week later. These approvals represent a public health breakthrough, providing the first protective measures against the largest global pandemic to strike in over 100 years, and were the first fruit of a vaccine development process akin in scope and urgency to the famed Manhattan Project. These two vaccines are also notable for being the first FDA-approved therapeutics to use a novel therapeutic platform: synthetic messenger RNA (mRNA).

Messenger RNAs are used in every cell of our body, where they serve as the central relay between the instructions of the genome and protein production. Synthetic mRNAs tap into this same natural process but are designed to encode proteins with therapeutic effects. The COVID-19 mRNA vaccines produce a full-length SARS-CoV-2 spike protein with two mutations (K986P and V987P) that ensure it remains in an antigenically favorable prefusion conformation. Upon injection, mRNA is taken up by muscle and infiltrating immune cells that use it to produce spike protein (Figure 1a). A transmembrane anchor causes the spike protein to be displayed on the cell surface, allowing it to be recognized by the immune system. This triggers the production of antibodies and T-cells that protect against natural infection and prevent serious disease. Since synthetic mRNAs produce only a single component of the SARS-CoV-2 genome, they cannot cause COVID-19. It is also important to note these vaccines are nonreplicating mRNAs that naturally decompose and do not integrate into genomes. Detailed descriptions of the development and characterization of these vaccines can be found in the primary literature reporting them as well as several excellent reviews.

The chemical components of mRNA vaccines are pleasantly unremarkable, consisting primarily of RNA plus "water, salt, sugar, and fat," with two notable exceptions. The first is the lipid nanoparticles that encapsulate the mRNA and facilitate its delivery, which are excellently reviewed elsewhere. The second is the non-natural RNA nucleobase N1-methylpseudouridine (m1Ψ; Figure 1b), which enhances immune evasion and protein production. In this Outlook, we briefly review the development and function of m1Ψ in synthetic mRNA. By demystifying how a critical component of these new medicines works, we hope to help foster their acceptance and highlight future areas for chemical innovation.

■ PRIMARY STRUCTURE OF THE COVID-19 MRNA VACCINES

The two approved COVID-19 mRNA vaccines are marketed by Pfizer-BioNTech (BNT162b2; trade name: Comirnaty; generic name: tozinameran) and Moderna (mRNA-1273). The sequence of the former has been disclosed (Figure 2). The active payload of the Pfizer-BioNTech vaccine is a 4284-nucleotide linear sequence of RNA consisting of five main elements:
Figure 1. (a) mRNA-based COVID-19 vaccine strategy. (b) Structural features of uridine and m1Ψ. TCR = T-cell receptor. MHC = major histocompatibility complex.


• A 5'-cap (m7(3'OMeG)(5')ppp(5')(2'OMeA)pG, commonly referred to as trinucleotide "cap 1") that helps recruit the ribosome and protect the RNA from degradation.

• A 5'-untranslated region (UTR) derived from the human α-globin mRNA with an optimized Kozak sequence that helps drive high levels of translation from the correct start codon.

• A codon-optimized coding sequence that specifies production of the transmembrane-anchored immunogenic SARS-CoV-2 spike glycoprotein.

• A 3'-UTR consisting of two sequences derived from the amino-terminal enhancer of split mRNA and the mitochondrially encoded 12S rRNA, which together support high levels of protein expression by stabilizing the RNA.

• An unusual 3'-terminus consisting of two segmented poly(adenosine) tracts. The poly(adenosine) stretches increase mRNA stability, while the segmented structure helps reduce unwanted recombination during plasmid production.
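
The segmented 3'-terminus described in the last element can be illustrated with a short Python sketch. It is only a toy under stated assumptions: the function names, the 10-adenosine threshold, and the example sequence (including its tract lengths) are hypothetical and are not taken from the vaccine specification.

```python
import re


def find_poly_a_tracts(rna: str, min_len: int = 10):
    """Return (start, end, length) for every run of at least `min_len` adenosines.

    Coordinates are 0-based and half-open. `min_len` is an arbitrary,
    illustrative threshold rather than a published definition of a poly(A) tract.
    """
    pattern = re.compile(r"A{%d,}" % min_len)
    return [(m.start(), m.end(), m.end() - m.start())
            for m in pattern.finditer(rna.upper())]


def has_segmented_tail(rna: str, min_len: int = 10) -> bool:
    """True if the RNA ends in a poly(A) tract that is preceded by another tract."""
    tracts = find_poly_a_tracts(rna, min_len)
    if len(tracts) < 2:
        return False
    # The final tract must reach the 3' end of the molecule.
    return tracts[-1][1] == len(rna)


# Hypothetical toy sequence: short body, A-tract, 10-nt linker, A-tract.
TOY_TAIL = "GCUAGC" + "A" * 12 + "GCAUAUGACU" + "A" * 15

if __name__ == "__main__":
    print(find_poly_a_tracts(TOY_TAIL))
    print(has_segmented_tail(TOY_TAIL))
```

Treating the tail as a list of tracts rather than a single run makes the linker between the two poly(adenosine) stretches explicit, which is the feature the text credits with reducing recombination during plasmid production.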


The swift design of these vaccines has been deservedly celebrated. However, it is important to gently push back on the narrative that this process was hurried, which may invite skepticism. Each of the elements above was a highly intentional choice that in many cases reflects decades of fundamental research in the RNA biology field. Below, we first review how these modified mRNAs are made, followed by an analysis of the modification's biological effects.


■ INCORPORATION OF N1-METHYLPSEUDOURIDINE INTO MRNA VACCINES


Figure 2. Top: Design elements found in synthetic mRNA therapeutics (modified nucleotides, 5'-cap, 5'-UTR, coding sequence, 3'-UTR, poly(A) tail). Bottom: Sequence of the COVID-19 mRNA vaccine tozinameran (BNT162b2) from Pfizer/BioNTech (U = m1Ψ). Green: 5'-cap. Yellow: 5'- and 3'-UTR sequences. Blue: SARS-CoV-2 spike glycoprotein coding sequence. Red: Segmented poly(A) tail.

Evaluating the design above first requires overcoming a technical challenge: how does one produce (at scale) a synthetic mRNA with a linear sequence far longer than can be chemically synthesized while simultaneously preserving the flexibility to incorporate modified nucleobases such as m1Ψ? The answer has been to take a cue from nature and make these mRNAs enzymatically (Figure 3). This approach takes advantage of the fact that DNA (which is far easier to synthesize than RNA) can be stitched together into large synthetic fragments. These fragments are used to construct plasmids, in which the code for the COVID-19 vaccine is placed downstream of a sequence that promotes its transcription into mRNA by recombinant T7 RNA polymerase. Incubating these plasmids with T7 polymerase and nucleoside triphosphates (NTPs) produces high yields of mRNA. Decades of research have characterized T7 polymerase as a remarkable enzyme, which can produce RNAs longer than 20 000 nucleotides without making an error. Another feature of T7 polymerase is its tolerance for non-natural NTPs. Over 50 years ago, Goldberg and Rabinowitz demonstrated that RNA polymerases can incorporate pseudouridine triphosphate into RNA. In one early study (which makes one quite thankful for the Sigma-Aldrich catalogue), pseudouridine was isolated from 20 L of urine donated by patients with leukemia, polycythemia, or gout, converted to a radiolabeled triphosphate by a mixed chemoenzymatic approach, and found to replace uridine in RNA during in vitro transcription when UTP was omitted. Early studies of T7 RNA polymerase found it was also permissive of modified NTPs that do not alter base pairing, and this strategy has since been applied to many different bases. One caveat to this enzymatic approach is that it replaces the natural nucleobase with a non-natural residue homogeneously; in the case of BNT162b2, every uridine residue in the mRNA is replaced with m1Ψ. This means that, to be useful in a therapeutic mRNA, a modified nucleobase must be compatible with all of its functional elements, including UTRs and the coding sequence recognized by the ribosome. With this understanding of the primary sequence of modified mRNA vaccines and how they are produced, we can proceed to a discussion of what they do.
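
The homogeneous substitution described above can be sketched in a few lines of Python. This is a minimal illustration, not a model of transcription: the function names, the use of "Ψ" as a one-letter symbol for m1Ψ, the toy sequence, and the region coordinates are all assumptions made for the example.

```python
M1PSI = "Ψ"  # illustrative one-letter stand-in for N1-methylpseudouridine


def substitute_all_uridines(rna: str, modified_symbol: str = M1PSI) -> str:
    """Mimic in vitro transcription with UTP fully replaced by a modified NTP.

    Every U in the input is rewritten; there is no way to keep some positions
    unmodified, which mirrors the "homogeneous" caveat in the text.
    """
    seq = rna.upper()
    unexpected = set(seq) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected symbols in RNA: {sorted(unexpected)}")
    return seq.replace("U", modified_symbol)


def uridines_per_region(rna: str, regions: dict) -> dict:
    """Count uridines (hence modified positions) in each named region.

    `regions` maps a label to a 0-based, half-open (start, end) interval.
    """
    seq = rna.upper()
    return {name: seq[start:end].count("U")
            for name, (start, end) in regions.items()}


# Hypothetical toy transcript: a 10-nt 5'-UTR followed by a 12-nt coding sequence.
TOY_RNA = "GAGAAUAAAC" + "AUGUUCGUGUGA"
TOY_REGIONS = {"5'-UTR": (0, 10), "CDS": (10, 22)}

if __name__ == "__main__":
    print(substitute_all_uridines(TOY_RNA))
    print(uridines_per_region(TOY_RNA, TOY_REGIONS))
```

Counting modified positions per region makes the consequence concrete: the same modification lands in UTRs and coding sequence alike, so it has to be tolerated by every functional element.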


■ N1-METHYLPSEUDOURIDINE REDUCES MRNA IMMUNOGENICITY


https://doi.org/10.1021/acscentsci.1c00197

Figure 3. Production of m1Ψ mRNAs by in vitro transcription. Left: Components of in vitro transcription reaction. Right: Incorporation of m1Ψ-triphosphate into RNA is guided by m1Ψ's ability to form a canonical base pair with adenine of the DNA template in the T7 RNA polymerase active site.

Figure 4. (a) Activation of innate immune response by mRNA secondary structures. (b) Structure of the single-stranded RNA sensor TLR7 in complex with a polyuridine (poly(U)) ligand (PDB ID: 5GMF). Replacing uridine with m1Ψ demonstrates the steric incompatibility of the modified nucleobase with TLR7 binding and immune activation.

Early studies showed that synthetic mRNAs entrapped in cationic lipid vesicles can be transfected into cultured cells. When injected into mouse muscle, reporter mRNAs produced detectable proteins for weeks. However, a challenge to application of these agents as vaccines and protein replacement therapies was their immunogenicity. Cells contain a variety of pattern recognition receptors whose natural role is to identify and respond to viral RNAs by inducing downstream signaling. These include the endosomal receptors TLR3, TLR7, and TLR8, which recognize double- and single-stranded RNA, and the cytosolic receptors RIG-I and MDA-5, which recognize double-stranded and 5'-triphosphate-modified RNA. While induction of an immune response is theoretically a positive attribute for a vaccine, uncontrolled immune activation can lead to allergic reactions and anaphylactic shock. Furthermore, at a molecular level, overstimulation of immune signaling is known to silence protein translation, with the potential outcome of limiting antigen expression and vaccine efficacy. A breakthrough came from the fundamental studies of Karikó and co-workers, who showed that many modifications naturally found in human RNA such as pseudouridine, thiouridine, and 5-methylcytidine reduce its immunostimulatory potential. This inspired follow-up studies demonstrating that these same nucleobase modifications could increase protein production from synthetic mRNAs and be used in many applications, including the generation of induced pluripotent stem cells. Further development of this concept led to m1Ψ, which in mRNA was found to increase protein output while decreasing TLR3 activation. The ability of m1Ψ and related modifications to reduce the immunogenicity of synthetic mRNA has been attributed to at least three mechanisms (Figure 4):




• Reduced synthesis of antisense RNA: Under high-yielding conditions, T7 RNA polymerase sometimes uses the RNA it has produced to "self-prime", leading to the synthesis of small amounts of duplexed antisense mRNA (Figure 4). Removal of these double-stranded RNA impurities by chromatography reduces, but does not eliminate, the differences in immunogenicity observed between m1Ψ-modified and unmodified RNAs. Other studies have also found that using base-modified NTPs yields noninflammatory mRNAs without the need for purification. This suggests that using the non-natural NTP for RNA synthesis may disfavor this side product.

• Altering interaction with RNA secondary structure: In addition to antisense impurities, mRNA can form secondary structures such as hairpins that may be recognized by immune receptors such as TLR3 and RIG-I (Figure 4). Incorporation of modified bases has the potential to reduce these recognition events by altering secondary structure and protein/double-stranded RNA interactions. In the related C-glycoside pseudouridine, isomerization shifts the structural equilibrium of the nucleotide toward a C3'-endo ribose sugar and an anti orientation of the base, a conformation that favors helicity and stacking. Consistent with this, a recent study used chemical probing reagents to find evidence that RNAs containing m1Ψ and uridine form distinct secondary structures. Modified nucleotides have also been found to reduce the ability of mRNAs to propagate immune signaling through RIG-I, indicative of their ability to influence protein-RNA interactions.

• Altering interaction with single-stranded RNA immune receptors: In immune cells, single-stranded poly(uridine) RNA is one of the most potent inducers of interferon and is sensed by TLR7. To define whether m1Ψ alters immune recognition of single-stranded RNA, a recent study assessed the ability of RNAs containing this species to activate inflammatory gene expression. To ensure any differences were not due to double-stranded RNA, the authors employed a mouse model where the immune response to these structures was silenced. Even in the absence of double-stranded RNA sensing, m1Ψ RNAs were less inflammatory than those containing canonical uridine. This suggests the altered hydrogen bonding face and steric "bump" presented by m1Ψ disrupt the interaction of immune sensors such as TLR7 with single-stranded segments of synthetic mRNA (Figure 4c).


It is important to note that in many studies, the specific contributions of each of these mechanisms to mRNA immunogenicity have not been explicitly defined. In such cases, an mRNA modification may be exerting its activity by altering antisense transcript synthesis, mRNA structure, immune recognition, or some combination thereof.

Vaccines often require coadministration of adjuvants, which are agents that prime the immune system to respond to an antigen of interest. In the case of tozinameran and mRNA-1273, this role appears to be fulfilled by the lipid nanoparticle, which can be tailored to predictably activate the immune response via mechanisms that do not halt protein production. Separating the adjuvant from the nucleic acid component of the vaccine reduces the chance that the mRNA sequence composition may influence vaccine efficacy. This potentially increases the strategy's generality and also opens the door to other applications, such as treatment of autoimmune disorders and therapeutic protein replacement.

Interestingly, at least two groups have reported that pseudouridine, the natural analogue of m1Ψ, does not measurably alter mRNA immunogenicity in vivo and that many of the benefits of m1Ψ can be obtained by simply engineering a synthetic mRNA's sequence to limit the use of uridine-containing codons. A comparative analysis of codons used in tozinameran relative to the spike glycoprotein encoded by the SARS-CoV-2 genome observes a disproportionate depletion of uridine residues, indicative of sequence engineering (Figure S1). In the context of the COVID-19 vaccine, the relative effects of sequence engineering and m1Ψ incorporation on the immunogenic mechanisms specified above remain to be reported.
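
The idea of limiting uridine-containing codons can be made concrete with a small Python sketch. It is a hedged illustration only: the eight-entry codon table is a subset of the standard genetic code chosen for the example, and the two toy sequences and function names are hypothetical rather than drawn from tozinameran or the viral genome.

```python
# Subset of the standard genetic code, enough for the toy peptide below.
CODON_TABLE = {
    "UUU": "F", "UUC": "F",  # phenylalanine
    "UUA": "L", "CUG": "L",  # leucine
    "GUU": "V", "GUG": "V",  # valine
    "UCU": "S", "AGC": "S",  # serine
}


def translate(rna: str) -> str:
    """Translate an in-frame RNA using the small illustrative codon table."""
    seq = rna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def uridine_fraction(rna: str) -> float:
    """Fraction of positions that are uridine."""
    seq = rna.upper()
    if not seq:
        raise ValueError("empty sequence")
    return seq.count("U") / len(seq)


# Two hypothetical synonymous encodings of the peptide F-L-V-S.
U_RICH = "UUUUUAGUUUCU"
U_DEPLETED = "UUCCUGGUGAGC"

if __name__ == "__main__":
    for label, seq in (("U-rich", U_RICH), ("U-depleted", U_DEPLETED)):
        print(label, translate(seq), round(uridine_fraction(seq), 3))
```

Both toy sequences encode the same peptide, yet the second carries fewer than half as many uridines, so fewer positions would be occupied by uridine (or by m1Ψ after substitution).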


■ N1-METHYLPSEUDOURIDINE CAN ALTER MRNA TRANSLATION


The ultimate purpose of an mRNA medicine is to express a therapeutic protein. Thus, m1Ψ and other modified bases have been explored for their ability to facilitate the translation of mRNA into protein via the ribosome. These studies are naturally intertwined with those above, as immune activation can limit translation by shutting down the ribosome and activating ribonucleases that degrade mRNA. Consistent with this, in the initial report where m1Ψ-containing mRNA was found to drive high levels of protein production, this was attributed in part to its ability to blunt TLR3 activation. To decouple translation and immune activation, Svitkin and co-workers analyzed the translation of m1Ψ mRNAs in a cell-free translation system. They observed that incorporation of m1Ψ increases the size and abundance of polysomes, leading them to propose that the more rapid translation initiation and slower elongation of m1Ψ mRNAs may coordinately increase their half-life as well as induce productive interactions with the ribosome. These studies provided the first evidence that m1Ψ may directly impact mRNA translation.

Natural RNA modifications are known to be context-dependent. This means they can exert different effects on different RNAs. Those effects may also be dependent on where in the RNA they lie (e.g., UTR, coding sequence). Two studies have examined the context-dependence of m1Ψ in a high-throughput fashion (Figure 5). In the first, Sample et al. used RNA sequencing of polysomes to compare how a library of uridine and m1Ψ mRNAs containing 280 000 different 5'-UTRs was loaded onto the ribosome. Across all sequences tested, ribosome loading was found to be anticorrelated with predicted mean free energy. This is consistent with the classical view that structured 5'-UTRs can repress translation. However, this anticorrelation was stronger for m1Ψ than uridine, indicating that by stabilizing RNA structure, the modified base may actually decrease protein production in these contexts. A second study by Mauger et al. examined the relationship between m1Ψ, RNA structure, and protein production in even greater detail. They evaluated modified (m1Ψ, pseudouridine, methoxyuridine) and unmodified (uridine) mRNAs across multiple synonymous versions of three different reporters, amounting to over 150 synthetic mRNAs in total. Within this library, modified and unmodified mRNAs were found to exhibit distinct "fingerprints" of codon optimality. Assuming uridine and m1Ψ are decoded similarly by the ribosome, this suggested that a feature other than codon optimality is responsible for tuning synthetic mRNA translation. To examine the potential role of structure in this process, the authors used a biochemical probing technique (SHAPE-MaP) to study modified and unmodified mRNAs. As in the case of 5'-UTRs, it was found that m1Ψ stabilizes structure. Further studies provided support for a model in which secondary structure in the coding sequence, which can be enforced by m1Ψ, may increase mRNA functional half-life independent of codon optimality.

Figure 5. m1Ψ exerts context-dependent effects on translation. Left: m1Ψ-dependent enforcement of secondary structure in the 5'-UTR of synthetic mRNAs can inhibit translation initiation. Right: m1Ψ-dependent enforcement of secondary structure in the coding sequences of synthetic mRNAs can increase their functional half-life. Note: While m1Ψ is homogeneously incorporated throughout synthetic mRNA vaccines, in these illustrations, m1Ψ is only specified in duplexes to emphasize its potential to influence mRNA structure.

One important aspect revealed by these studies is that m1Ψ is not a panacea for protein production. While for most mRNA sequences m1Ψ performed as well as or better than uridine, in some it performed worse. Similar observations have been made for pseudouridine, which in one study was found to be incompatible with protein output from mRNAs containing structured viral internal ribosomal entry sites in their 5'-UTR region. The efficient translation of many different m1Ψ-containing mRNAs suggests that the secondary structures induced by this modification do not activate immune sensors. This may reflect their small size or greater dynamics relative to the stable duplexes found in classic TLR3 agonists such as poly(I:C) or the intrinsic ability of m1Ψ to impede the protein-mRNA interactions responsible for immune activation.


■ CONCLUSIONS


The shock of the COVID-19 pandemic mobilized the biomedical research community on an unprecedented scale and enabled the most rapid vaccine production process in human history. This success also presents a unique challenge to scientific communication, which is how to highlight the decades of fundamental research that underlie these medicines. In this Outlook, we describe for a scientific lay audience the development and application of m1Ψ, a chemical component of COVID-19 mRNA vaccines. The modified nucleobase helps cloak mRNA vaccines from the immune system, limiting their undesired immune stimulation, and in certain circumstances may also enhance the synthesis of antigens by the protein-producing machinery of the cell. This allows these vaccines to tap into the natural process of mRNA translation without triggering harmful side effects such as anaphylaxis.

In light of the current concern over emerging SARS-CoV-2 variants, it is worth highlighting how synthetic mRNAs are being developed for use in personalized cancer immunotherapy. In this approach, clinicians remove a tumor, sequence it to identify coding mutations, and use this information to design custom mRNAs that express those mutant peptides at high levels, which helps train the immune system to selectively attack tumor tissue. In other words, synthetic mRNA platforms have been built with the express purpose of rapidly addressing newly discovered mutations. This bodes well for the potential of these medicines to be reconfigured to combat emerging viral strains and suggests one unexpected legacy of this pandemic may be to accelerate the use of synthetic mRNAs in cancer treatment.

Finally, our review of m1Ψ highlights future areas where chemical innovation may help extend the reach of therapeutic mRNAs. First, while the modular nature of mRNA vaccines has led to considerable enthusiasm, the combinatorial space of elements that contribute to their activity (including caps, coding sequence, codons, UTRs, and modifications) is massive in scale, and relatively few RNA modifications have been comparatively evaluated in a systematic manner. High-throughput approaches will be critical to help define this space and develop optimized agents. The exploration of novel nucleobases may also be aided by efficient routes to nucleoside triphosphates as well as biological insights arising from the recent renaissance in the study of endogenous mRNA modifications. The production of novel mRNA therapies may also be aided by the evolution of RNA polymerases with improved synthetic properties such as expanded nucleobase tolerance or a reduced production of antisense transcripts. The successful engineering of DNA polymerases for genome sequencing speaks to the feasibility and potential impact of this goal.

Almost 60 years ago in "Meditations in an Emergency" the poet Frank O'Hara wrote, "I am needed by things as the sky must be above the earth./And lately, so great has their anxiety become, I can spare myself little sleep." O'Hara's passage resonates with our current era and the tremendous strain felt by patients, families, and healthcare providers during this pandemic. The nucleobase m1Ψ, a "modification in an emergency", provides an example of how contemplation can also lead to intervention, offering hope and rest in a time of crisis.
■ NOTE ADDED IN PROOF


A putative sequence for the Moderna vaccine was reported while this manuscript was in production. Although this sequence has not been validated, the initial report does include the statement that, "the RNAs that are now a part of the human ecosystem and that are likely to appear in numerous other high throughput RNA-seq studies in which a fraction of the individuals may have previously been vaccinated." In our view this is likely erroneous, as there is no evidence for long-term detection of mRNA vaccines in vaccinated individuals by RNA-seq. Indeed, all experimental evidence to date supports the view that synthetic mRNAs are efficiently destroyed by the body in the days following vaccination.